tien_nemo

Merge branch '2025/ntctien/14567_edit_template' into 'dev'

2025/ntctien/14567 edit template

See merge request !4
1 -<p align="center"><a href="https://laravel.com" target="_blank"><img src="https://raw.githubusercontent.com/laravel/art/master/logo-lockup/5%20SVG/2%20CMYK/1%20Full%20Color/laravel-logolockup-cmyk-red.svg" width="400"></a></p> 1 +- **run project**
2 - 2 + php artisan serve
3 -<p align="center"> 3 + http://127.0.0.1:8000/ocr
4 -<a href="https://travis-ci.org/laravel/framework"><img src="https://travis-ci.org/laravel/framework.svg" alt="Build Status"></a> 4 +
5 -<a href="https://packagist.org/packages/laravel/framework"><img src="https://img.shields.io/packagist/dt/laravel/framework" alt="Total Downloads"></a> 5 +- **create database**
6 -<a href="https://packagist.org/packages/laravel/framework"><img src="https://img.shields.io/packagist/v/laravel/framework" alt="Latest Stable Version"></a> 6 + CREATE TABLE `mst_template` (
7 -<a href="https://packagist.org/packages/laravel/framework"><img src="https://img.shields.io/packagist/l/laravel/framework" alt="License"></a> 7 + `id` INT(11) NOT NULL AUTO_INCREMENT,
8 -</p> 8 + `tpl_name` VARCHAR(50) NOT NULL COLLATE 'utf8mb4_general_ci',
9 - 9 + `tpl_text` VARCHAR(50) NOT NULL COLLATE 'utf8mb4_general_ci',
10 -## About Laravel 10 + `tpl_xy` VARCHAR(50) NOT NULL COLLATE 'utf8mb4_general_ci',
11 - 11 + `in_date` DATETIME NOT NULL DEFAULT current_timestamp(),
12 -Laravel is a web application framework with expressive, elegant syntax. We believe development must be an enjoyable and creative experience to be truly fulfilling. Laravel takes the pain out of development by easing common tasks used in many web projects, such as: 12 + `up_date` DATETIME NOT NULL DEFAULT current_timestamp() ON UPDATE current_timestamp(),
13 - 13 + PRIMARY KEY (`id`) USING BTREE,
14 -- [Simple, fast routing engine](https://laravel.com/docs/routing). 14 + UNIQUE INDEX `tpl_name` (`tpl_name`) USING BTREE
15 -- [Powerful dependency injection container](https://laravel.com/docs/container). 15 + )
16 -- Multiple back-ends for [session](https://laravel.com/docs/session) and [cache](https://laravel.com/docs/cache) storage. 16 + COLLATE='utf8mb4_general_ci'
17 -- Expressive, intuitive [database ORM](https://laravel.com/docs/eloquent). 17 + ENGINE=MyISAM
18 -- Database agnostic [schema migrations](https://laravel.com/docs/migrations). 18 + AUTO_INCREMENT=11
19 -- [Robust background job processing](https://laravel.com/docs/queues). 19 + ;
20 -- [Real-time event broadcasting](https://laravel.com/docs/broadcasting). 20 +
21 - 21 + CREATE TABLE `dt_template` (
22 -Laravel is accessible, powerful, and provides tools required for large, robust applications. 22 + `tpl_detail_id` INT(11) NOT NULL AUTO_INCREMENT,
23 - 23 + `tpl_id` INT(11) NOT NULL,
24 -## Learning Laravel 24 + `field_name` VARCHAR(50) NULL DEFAULT NULL COLLATE 'utf8mb4_general_ci',
25 - 25 + `field_xy` VARCHAR(50) NULL DEFAULT NULL COLLATE 'utf8mb4_general_ci',
26 -Laravel has the most extensive and thorough [documentation](https://laravel.com/docs) and video tutorial library of all modern web application frameworks, making it a breeze to get started with the framework. 26 + `in_date` DATETIME NOT NULL DEFAULT current_timestamp(),
27 - 27 + `up_date` DATETIME NOT NULL DEFAULT current_timestamp() ON UPDATE current_timestamp(),
28 -If you don't feel like reading, [Laracasts](https://laracasts.com) can help. Laracasts contains over 1500 video tutorials on a range of topics including Laravel, modern PHP, unit testing, and JavaScript. Boost your skills by digging into our comprehensive video library. 28 + PRIMARY KEY (`tpl_detail_id`) USING BTREE,
29 - 29 + INDEX `tpl_id` (`tpl_id`) USING BTREE
30 -## Laravel Sponsors 30 + )
31 - 31 + COLLATE='utf8mb4_general_ci'
32 -We would like to extend our thanks to the following sponsors for funding Laravel development. If you are interested in becoming a sponsor, please visit the Laravel [Patreon page](https://patreon.com/taylorotwell). 32 + ENGINE=MyISAM
33 - 33 + AUTO_INCREMENT=7
34 -### Premium Partners 34 + ;
35 - 35 +
36 -- **[Vehikl](https://vehikl.com/)** 36 +
37 -- **[Tighten Co.](https://tighten.co)** 37 +
38 -- **[Kirschbaum Development Group](https://kirschbaumdevelopment.com)** 38 +- /ocrpdf/app/Services/OCR/read_pdf.py : read pdf
39 -- **[64 Robots](https://64robots.com)**
40 -- **[Cubet Techno Labs](https://cubettech.com)**
41 -- **[Cyber-Duck](https://cyber-duck.co.uk)**
42 -- **[Many](https://www.many.co.uk)**
43 -- **[Webdock, Fast VPS Hosting](https://www.webdock.io/en)**
44 -- **[DevSquad](https://devsquad.com)**
45 -- **[Curotec](https://www.curotec.com/services/technologies/laravel/)**
46 -- **[OP.GG](https://op.gg)**
47 -- **[WebReinvent](https://webreinvent.com/?utm_source=laravel&utm_medium=github&utm_campaign=patreon-sponsors)**
48 -- **[Lendio](https://lendio.com)**
49 -
50 -## Contributing
51 -
52 -Thank you for considering contributing to the Laravel framework! The contribution guide can be found in the [Laravel documentation](https://laravel.com/docs/contributions).
53 -
54 -## Code of Conduct
55 -
56 -In order to ensure that the Laravel community is welcoming to all, please review and abide by the [Code of Conduct](https://laravel.com/docs/contributions#code-of-conduct).
57 -
58 -## Security Vulnerabilities
59 -
60 -If you discover a security vulnerability within Laravel, please send an e-mail to Taylor Otwell via [taylor@laravel.com](mailto:taylor@laravel.com). All security vulnerabilities will be promptly addressed.
61 -
62 -## License
63 -
64 -The Laravel framework is open-sourced software licensed under the [MIT license](https://opensource.org/licenses/MIT).
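The two tables added to the README look like a header/detail pair: `dt_template.tpl_id` appears to point at `mst_template.id`, although the DDL declares no foreign key. A minimal sketch of that relationship follows. It rests on several assumptions: SQLite stands in for MySQL (types, collations, and the `in_date`/`up_date` columns are simplified away), and the template name, field names, and the `x1,y1,x2,y2` string format for the `*_xy` columns are hypothetical.

```python
import sqlite3

# Assumption: in-memory SQLite is used only for illustration; the project targets MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mst_template (
    id       INTEGER PRIMARY KEY AUTOINCREMENT,
    tpl_name TEXT NOT NULL UNIQUE,
    tpl_text TEXT NOT NULL,
    tpl_xy   TEXT NOT NULL
);
CREATE TABLE dt_template (
    tpl_detail_id INTEGER PRIMARY KEY AUTOINCREMENT,
    tpl_id        INTEGER NOT NULL,
    field_name    TEXT,
    field_xy      TEXT
);
CREATE INDEX idx_dt_template_tpl_id ON dt_template (tpl_id);
""")

# Hypothetical template header row; the coordinate string format is an assumption.
cur = conn.execute(
    "INSERT INTO mst_template (tpl_name, tpl_text, tpl_xy) VALUES (?, ?, ?)",
    ("invoice_a", "INVOICE", "120,40,380,90"),
)
tpl_id = cur.lastrowid

# Hypothetical detail rows, one per field mapped on the template.
conn.executemany(
    "INSERT INTO dt_template (tpl_id, field_name, field_xy) VALUES (?, ?, ?)",
    [
        (tpl_id, "order_no", "50,200,250,230"),
        (tpl_id, "total", "400,900,560,930"),
    ],
)

# Load a template together with its fields by the unique template name.
rows = conn.execute(
    """
    SELECT m.tpl_name, d.field_name, d.field_xy
    FROM mst_template AS m
    JOIN dt_template AS d ON d.tpl_id = m.id
    WHERE m.tpl_name = ?
    ORDER BY d.tpl_detail_id
    """,
    ("invoice_a",),
).fetchall()
```

Because both tables use `ENGINE=MyISAM`, MySQL would not enforce a foreign key here even if one were declared, so the application would have to keep `tpl_id` consistent itself.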
......
1 +from paddleocr import PaddleOCR
2 +from pdf2image import convert_from_path
3 +import os
4 +import time
5 +import numpy as np
6 +import json
7 +from pathlib import Path
8 +import cv2
9 +from table_detector import detect_tables
10 +
11 +# ==== Config ====
12 +BASE_DIR = os.path.abspath(os.path.join(os.path.dirname(__file__), "..", "..", ".."))
13 +PDF_NAME = 'aaaa'
14 +
15 +# PDF path
16 +pdf_path = Path(BASE_DIR) / "storage" / "pdf" / "fax.pdf"
17 +# Output folder
18 +output_folder = Path(BASE_DIR) / "public" / "image"
19 +
20 +#PDF_NAME = pdf_path.stem # Get the stem of the PDF file
21 +#print(PDF_NAME)
22 +
23 +os.makedirs(output_folder, exist_ok=True)
24 +
25 +timestamp = int(time.time())
26 +img_base_name = f"{PDF_NAME}_{timestamp}"
27 +
28 +# ==== OCR Init ====
29 +ocr = PaddleOCR(
30 + use_doc_orientation_classify=False,
31 + use_doc_unwarping=False,
32 + use_textline_orientation=False
33 +)
34 +
35 +# ==== PDF to Image ====
36 +pages = convert_from_path(pdf_path, first_page=1, last_page=1)
37 +image_path = os.path.join(output_folder, f"{img_base_name}.jpg")
38 +pages[0].save(image_path, "JPEG")
39 +
40 +# ==== Run OCR ====
41 +image_np = np.array(pages[0])
42 +results = ocr.predict(image_np)
43 +
44 +# ==== Convert polygon to bbox ====
45 +def poly_to_bbox(poly):
46 + xs = [p[0] for p in poly]
47 + ys = [p[1] for p in poly]
48 + return [int(min(xs)), int(min(ys)), int(max(xs)), int(max(ys))]
49 +
50 +# ==== Build ocrData ====
51 +ocr_data_list = []
52 +for res in results:
53 + for text, poly in zip(res['rec_texts'], res['rec_polys']):
54 + bbox = poly_to_bbox(poly)
55 + ocr_data_list.append({
56 + "text": text,
57 + "bbox": bbox,
58 + "field": "",
59 + "hideBorder": False
60 + })
61 +
62 +# ==== Detect table ====
63 +table_info = detect_tables(image_path)
64 +
65 +# ==== Build JSON ====
66 +final_json = {
67 + "ocr_data": ocr_data_list,
68 + "tables": table_info
69 +}
70 +
71 +
72 +# ==== Save JSON ====
73 +json_path = os.path.join(output_folder, f"{PDF_NAME}_{timestamp}_with_table.json")
74 +with open(json_path, "w", encoding="utf-8") as f:
75 + json.dump(final_json, f, ensure_ascii=False, indent=2)
76 +
77 +print(f"Saved OCR + Table JSON to: {json_path}")
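The polygon-to-bbox step in `read_pdf.py` can be tried without PaddleOCR installed. The sketch below reuses `poly_to_bbox` as written in the diff and feeds it a hand-made dictionary; `fake_result` is a hypothetical stand-in that contains only the two keys the script reads (`rec_texts` and `rec_polys`), not the real result object.

```python
def poly_to_bbox(poly):
    """Collapse a polygon (sequence of (x, y) points) into [x_min, y_min, x_max, y_max]."""
    xs = [p[0] for p in poly]
    ys = [p[1] for p in poly]
    return [int(min(xs)), int(min(ys)), int(max(xs)), int(max(ys))]


# Hypothetical stand-in for one OCR result; the values are made up.
fake_result = {
    "rec_texts": ["INVOICE", "Total"],
    "rec_polys": [
        [(100.4, 20.0), (300.0, 22.5), (299.0, 60.9), (101.0, 58.0)],
        [(50, 400), (120, 400), (120, 430), (50, 430)],
    ],
}

# Same record shape the script appends to ocr_data_list.
ocr_data = [
    {"text": text, "bbox": poly_to_bbox(poly), "field": "", "hideBorder": False}
    for text, poly in zip(fake_result["rec_texts"], fake_result["rec_polys"])
]
```

Note that `int()` truncates toward zero, so a slightly skewed polygon can lose up to one pixel on its maximum edge (60.9 becomes 60 above).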
1 +import cv2
2 +import numpy as np
3 +import os
4 +
5 +def detect_tables(image_path):
6 + img = cv2.imread(image_path)
7 + if img is None:
8 + raise FileNotFoundError(f"Could not read image: {image_path}")
9 +
10 + gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
11 + blur = cv2.GaussianBlur(gray, (3, 3), 0)
12 +
13 + # Edge detection
14 + edges = cv2.Canny(blur, 50, 150, apertureSize=3)
15 +
16 + # --- Horizontal lines ---
17 + lines_h = cv2.HoughLinesP(edges, 1, np.pi/180, threshold=120,
18 + minLineLength=int(img.shape[1] * 0.6), maxLineGap=20)
19 + ys_candidates, line_segments = [], []
20 + if lines_h is not None:
21 + for l in lines_h:
22 + x1, y1, x2, y2 = l[0]
23 + if abs(y1 - y2) <= 3: # horizontal
24 + y_mid = int(round((y1 + y2) / 2))
25 + ys_candidates.append(y_mid)
26 + line_segments.append((x1, x2, y_mid))
27 +
28 + # cluster nearby y coordinates
29 + ys, tol_y = [], 10
30 + for y in sorted(ys_candidates):
31 + if not ys or abs(y - ys[-1]) > tol_y:
32 + ys.append(y)
33 +
34 + total_rows = max(0, len(ys) - 1)
35 +
36 + # --- Vertical lines ---
37 + lines_v = cv2.HoughLinesP(edges, 1, np.pi/180, threshold=100,
38 + minLineLength=int(img.shape[0] * 0.5), maxLineGap=20)
39 + xs = []
40 + if lines_v is not None:
41 + for l in lines_v:
42 + x1, y1, x2, y2 = l[0]
43 + if abs(x1 - x2) <= 3:
44 + xs.append(int(round((x1 + x2) / 2)))
45 +
46 + # cluster nearby columns (x coordinates)
47 + x_pos, tol_v = [], 10
48 + for v in sorted(xs):
49 + if not x_pos or v - x_pos[-1] > tol_v:
50 + x_pos.append(v)
51 +
52 + total_cols = max(0, len(x_pos) - 1)
53 +
54 + tables = []
55 + if len(ys) >= 3 and line_segments:
56 + y_min, y_max = ys[0], ys[-1]
57 + min_x = min(seg[0] for seg in line_segments)
58 + max_x = max(seg[1] for seg in line_segments)
59 + table_box = (min_x, y_min, max_x, y_max)
60 +
61 + rows = []
62 + for i in range(len(ys) - 1):
63 + row_box = (min_x, ys[i], max_x, ys[i+1])
64 + rows.append({"row": tuple(int(v) for v in row_box)})
65 + cv2.rectangle(img, (row_box[0], row_box[1]), (row_box[2], row_box[3]), (0, 255, 255), 2)
66 +
67 + tables.append({
68 + "total_rows": int(total_rows),
69 + "total_cols": int(total_cols),
70 + "table_box": tuple(int(v) for v in table_box),
71 + "rows_box": rows
72 + })
73 +
74 + cv2.rectangle(img, (min_x, y_min), (max_x, y_max), (255, 0, 0), 3)
75 +
76 + debug_path = os.path.splitext(image_path)[0] + "_debug.jpg"
77 + cv2.imwrite(debug_path, img)
78 +
79 + return tables
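The grouping step in `detect_tables` keeps one representative per cluster of nearby line coordinates and then turns consecutive y values into row boxes. Below is a standalone sketch of just that logic, with no OpenCV dependency. The helper names `cluster_positions` and `row_boxes` and the sample coordinates are hypothetical; the comparison against the last kept value mirrors the loops in the diff.

```python
def cluster_positions(values, tol=10):
    """Keep the first value of each run of coordinates lying within `tol` of it."""
    kept = []
    for v in sorted(values):
        # Compare against the last kept representative, as detect_tables does.
        if not kept or v - kept[-1] > tol:
            kept.append(v)
    return kept


def row_boxes(ys, min_x, max_x):
    """Pair consecutive horizontal lines into (x1, y1, x2, y2) row boxes."""
    return [(min_x, top, max_x, bottom) for top, bottom in zip(ys, ys[1:])]


# Made-up y midpoints, as if several Hough segments hit the same ruled lines.
ys = cluster_positions([101, 99, 150, 153, 148, 200, 260])
rows = row_boxes(ys, min_x=40, max_x=800)
```

One consequence of comparing with the kept representative rather than a running mean is that the reported position of a line is its smallest detected coordinate, which is usually close enough for a tolerance of 10 px.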
1 +
2 +body {
3 + font-family: sans-serif; background: #f5f5f5;
4 +}
5 +
6 +#app {
7 + display: flex; gap: 20px; padding: 20px;
8 +}
9 +
10 +.right-panel {
11 + width: 500px; background: #fff; padding: 15px;
12 + border-radius: 8px; box-shadow: 0 0 5px rgba(0,0,0,0.1);
13 +}
14 +
15 +.form-group {
16 + margin-bottom: 15px;
17 +}
18 +
19 +.form-group label {
20 + font-weight: bold; display: block; margin-bottom: 5px;
21 +}
22 +
23 +.form-group input {
24 + width: 100%;
25 + padding: 6px;
26 + border: 1px solid #ccc;
27 + border-radius: 4px;
28 +}
29 +
30 +.left-panel {
31 + flex: 1;
32 + position: relative;
33 + background: #eee;
34 + border-radius: 8px;
35 + overflow: hidden;
36 + user-select: none;
37 +}
38 +
39 +.pdf-container {
40 + position: relative; display: inline-block;
41 +}
42 +
43 +.bbox {
44 + position: absolute;
45 + border: 2px solid #ff5252;
46 + /*background-color: rgba(255, 82, 82, 0.2);*/
47 + cursor: pointer;
48 +}
49 +
50 +.bbox.active {
51 + border-color: #199601 !important;
52 + background-color: rgba(25, 150, 1, 0.4) !important;
53 +}
54 +
55 +@keyframes focusPulse {
56 + 0% { transform: scale(1); }
57 + 50% { transform: scale(1.05); }
58 + 100% { transform: scale(1); }
59 +}
60 +
61 +select {
62 + position: absolute;
63 + z-index: 10;
64 + background: #fff;
65 + border: 1px solid #ccc;
66 +}
67 +
68 +.select-box {
69 + position: absolute;
70 + /*border: 2px dashed #2196F3;*/
71 + background-color: rgba(33, 150, 243, 0.2);
72 + pointer-events: none;
73 + z-index: 5;
74 +}
75 +
76 +.delete-btn {
77 + position: absolute;
78 + top: 50%;
79 + right: -35px;
80 + transform: translateY(-50%);
81 + cursor: pointer;
82 + padding: 3px 6px;
83 + z-index: 20;
84 +}
85 +
86 +
87 +.edge {
88 + position: absolute;
89 + z-index: 25;
90 +
91 +}
92 +
93 +.edge.top, .edge.bottom {
94 + height: 8px;
95 + cursor: ns-resize;
96 +}
97 +.edge.left, .edge.right {
98 + width: 8px;
99 + cursor: ew-resize;
100 +}
101 +
102 +.edge.top {
103 + top: -4px;
104 + left: 0;
105 + right: 0;
106 +}
107 +
108 +.edge.bottom {
109 + bottom: -4px;
110 + left: 0;
111 + right: 0;
112 +}
113 +
114 +.edge.left {
115 + top: 0;
116 + bottom: 0;
117 + left: -4px;
118 +}
119 +
120 +.edge.right {
121 + top: 0;
122 + bottom: 0;
123 + right: -4px;
124 +}
125 +
126 +.corner {
127 + position: absolute;
128 + width: 14px;
129 + height: 14px;
130 + background: transparent;
131 + border: 2px solid transparent; /* transparent by default; only two edges are colored */
132 + z-index: 30;
133 + opacity: .95;
134 + transition: border-width .08s ease, transform .08s ease, opacity .08s ease;
135 + pointer-events: auto; /* capture drag events for resizing */
136 +}
137 +
138 +
139 +/* Each corner shows two edges and is rounded on its own corner */
140 +.corner.top-left {
141 + top: -8px; left: -8px;
142 + /*border-left-color: var(--corner-color);*/
143 + /*border-top-color: var(--corner-color);*/
144 + border-top-left-radius: 6px;
145 + cursor: nwse-resize;
146 +}
147 +
148 +.corner.top-right {
149 + top: -8px; right: -8px;
150 + /*border-right-color: var(--corner-color);*/
151 + /*border-top-color: var(--corner-color);*/
152 + border-top-right-radius: 6px;
153 + cursor: nesw-resize;
154 +}
155 +
156 +.corner.bottom-left {
157 + bottom: -8px; left: -8px;
158 + /*border-left-color: var(--corner-color);*/
159 + /*border-bottom-color: var(--corner-color);*/
160 + border-bottom-left-radius: 6px;
161 + cursor: nesw-resize;
162 +}
163 +
164 +.corner.bottom-right {
165 + bottom: -8px; right: -8px;
166 + /*border-right-color: var(--corner-color);*/
167 + /*border-bottom-color: var(--corner-color);*/
168 + border-bottom-right-radius: 6px;
169 + cursor: nwse-resize;
170 +}
171 +
172 +/* Hover effect: thicker and more visible */
173 +.corner:hover {
174 + border-width: 3px;
175 + opacity: 1;
176 + transform: scale(1.02);
177 +}
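Since `.bbox` is absolutely positioned inside `.pdf-container`, each OCR box presumably has to be converted from image pixels to offsets in the displayed image. The front-end code is not part of this diff, so the following is only a sketch of that arithmetic in Python; the function name `bbox_to_css`, the page sizes, and the rounding choice are all assumptions.

```python
def bbox_to_css(bbox, natural_size, displayed_size):
    """Scale an [x1, y1, x2, y2] box from image pixels to CSS offsets.

    natural_size and displayed_size are (width, height) tuples.
    """
    x1, y1, x2, y2 = bbox
    sx = displayed_size[0] / natural_size[0]
    sy = displayed_size[1] / natural_size[1]
    return {
        "left": f"{round(x1 * sx)}px",
        "top": f"{round(y1 * sy)}px",
        "width": f"{round((x2 - x1) * sx)}px",
        "height": f"{round((y2 - y1) * sy)}px",
    }


# Hypothetical sizes: a 1700x2200 page image shown at half scale.
style = bbox_to_css([100, 20, 300, 60], natural_size=(1700, 2200), displayed_size=(850, 1100))
```

The `.edge` and `.corner` handles use negative offsets (`-4px`, `-8px`) relative to this box, so they sit just outside the computed rectangle.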